Abstract

We show the NP-completeness of the existential theory of term algebras with the Knuth-Bendix order by giving a nondeterministic polynomial-time algorithm for solving Knuth-Bendix ordering constraints.

1 Introduction

Solving ordering constraints in term algebras with various reduction orders is used in rewriting to prove termination of recursive definitions and in automated deduction to prune the search space [Comon 1990, Kirchner 1995, Nieuwenhuis 1999]. Nieuwenhuis [1999] connects further progress in automated deduction with constraint-based deduction.

Two kinds of orders are used in automated deduction: the Knuth-Bendix order [Knuth and Bendix 1970] and various versions of recursive path orders [Dershowitz 1982, Kamin and Levy 1980]. The Knuth-Bendix order is used in state-of-the-art theorem provers, for example, E [Schulz 1999], SPASS [Weidenbach, Afshordel, Brahm, Cohrs, Engel, Keen, Theobalt and Topic 1999], Vampire [Riazanov and Voronkov 1999], and Waldmeister [Hillenbrand, Buch, Vogt and Lochner 1997]. There is extensive literature on solving recursive path ordering constraints (e.g., [Comon 1990, Jouannaud and Okada 1991, Nieuwenhuis 1993, Narendran, Rusinowitch and Verma 1999]). The decidability of Knuth-Bendix ordering constraints was proved only recently in [Korovin and Voronkov 2000]. The algorithm described in that paper shows that the problem belongs to 2-NEXPTIME. It was also shown that the problem is NP-hard by reduction of the solvability of systems of linear Diophantine equations to the solvability of Knuth-Bendix ordering constraints. In this paper we present a nondeterministic polynomial-time algorithm for solving Knuth-Bendix ordering constraints, and hence show that the problem is contained in NP for every term algebra with a Knuth-Bendix order. As a consequence, we obtain that the existential first-order theory of any term algebra with a Knuth-Bendix order is NP-complete too. Let us note that solvability of Knuth-Bendix ordering constraints consisting of a single inequality can be decided in polynomial time [Korovin and Voronkov 2001].

This paper is structured as follows. In Section 2 we define the main notions of this paper. In Section 3 we introduce the notion of isolated form of constraints and show that every constraint can be effectively transformed into an equivalent disjunction of constraints in isolated form. This transformation is represented as a nondeterministic polynomial-time algorithm computing members of this disjunction. After this, it remains to show that solvability of constraints in isolated form can be decided by a nondeterministic polynomial-time algorithm. In Section 4 we present such an algorithm using transformation to systems of linear Diophantine inequalities over the weights of variables. Finally, in Section 5 we complete the proof of the main result and present some examples. Section 6 discusses related work and open problems.

2 Preliminaries 

A signature is a finite set of function symbols with associated arities. In this paper we assume an arbitrary but fixed signature $\Sigma$. Constants are function symbols of arity 0. We assume that $\Sigma$ contains at least one constant. We denote variables by $x, y, z$ and terms by $r, s, t$. The set of all ground terms of the signature $\Sigma$ can be considered as the term algebra of this signature, $\mathrm{TA}(\Sigma)$, by defining the interpretation $g^{\mathrm{TA}(\Sigma)}$ of any function symbol $g$ by $g^{\mathrm{TA}(\Sigma)}(t_1, \ldots, t_n) = g(t_1, \ldots, t_n)$. For details see, e.g., [Hodges 1993] or [Maher 1988]. It is easy to see that in term algebras any ground term is interpreted by itself.

Denote the set of natural numbers by $\mathbb{N}$. The Knuth-Bendix order is a family of orders parametrized by two parameters: a weight function and a precedence relation.

Definition 2.1 (weight function) We call a weight function on $\Sigma$ any function $w : \Sigma \to \mathbb{N}$ such that (i) $w(a) > 0$ for every constant $a \in \Sigma$, (ii) there exists at most one unary function symbol $f \in \Sigma$ such that $w(f) = 0$. Given a weight function $w$, we call $w(g)$ the weight of $g$. The weight of any ground term $t$, denoted $|t|$, is defined as follows: for every constant $c$ we have $|c| = w(c)$ and for every function symbol $g$ of a positive arity $|g(t_1, \ldots, t_n)| = w(g) + |t_1| + \ldots + |t_n|$. □

These conditions on the weight function ensure that the Knuth-Bendix order is a simplification order total on ground terms (see, e.g., [Baader and Nipkow 1998]). In this paper, $f$ will always denote a unary function symbol of weight 0.
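
As an illustration of Definition 2.1, the following Python sketch computes the weight of a ground term. The encoding of terms as nested tuples and the weight table `W` are assumptions made only for this example; they satisfy conditions (i) and (ii) but are otherwise arbitrary.

```python
# Ground terms are nested tuples: ("g", t1, ..., tn); a constant is ("a",).
# W is a hypothetical weight function: constants have positive weight and
# only the unary symbol f has weight 0.
W = {"a": 1, "b": 2, "g": 1, "f": 0}

def weight(term, w=W):
    """Return |t| = w(g) + |t1| + ... + |tn| for a ground term t = g(t1, ..., tn)."""
    symbol, *args = term
    return w[symbol] + sum(weight(arg, w) for arg in args)

a, b = ("a",), ("b",)
print(weight(("g", a, ("f", b))))  # w(g) + |a| + (w(f) + |b|) = 1 + 1 + 0 + 2 = 4
```

Wrapping a term in $f$ does not change its weight under this table, which is the situation covered by the second part of Lemma 2.2 below.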

The following lemma is straightforward. 

Lemma 2.2 Every weight function satisfies the following properties.

1. The weight of every term is positive.

2. If $\Sigma$ contains no unary function symbol of weight 0, then for every natural number $n$ there is only a finite number of terms of weight $n$. If $\Sigma$ contains the unary function symbol of weight 0, then for every weight there are either no terms of this weight at all or an infinite number of different terms.

3. If a term $s$ is a subterm of $t$ and $|s| = |t|$, then $t$ has the form $f^m(s)$ for some $m$ (recall that $f$ is the function symbol of weight 0). □

Definition 2.3 A precedence relation on $\Sigma$ is any total order $\gg$ on $\Sigma$. A precedence relation $\gg$ is said to be compatible with a weight function $w$ if the existence of a unary function symbol $f$ of weight zero implies that $f$ is the greatest element w.r.t. $\gg$. □

In the sequel we assume a fixed weight function $w$ on $\Sigma$ and a fixed precedence relation $\gg$ on $\Sigma$, compatible with $w$.

Definition 2.4 The Knuth-Bendix order on $\mathrm{TA}(\Sigma)$ is the binary relation $\succ$ defined as follows. For any ground terms $t = g(t_1, \ldots, t_n)$ and $s = h(s_1, \ldots, s_k)$ we have $t \succ s$ if one of the following conditions holds:

1. $|t| > |s|$;

2. $|t| = |s|$ and $g \gg h$;

3. $|t| = |s|$, $g = h$ and for some $1 \le i \le n$ we have $t_1 = s_1, \ldots, t_{i-1} = s_{i-1}$ and $t_i \succ s_i$.

□
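
The three conditions of Definition 2.4 can be read as a comparison procedure on ground terms. The sketch below is only an illustration under assumed parameters: the tuple encoding of terms, the weight table `WEIGHT` and the precedence list `PRECEDENCE` are hypothetical and chosen so that the precedence is compatible with the weights.

```python
# Ground terms are nested tuples: ("g", t1, ..., tn); a constant is ("a",).
# WEIGHT and PRECEDENCE are hypothetical example parameters.
WEIGHT = {"a": 1, "b": 1, "g": 1, "f": 0}
PRECEDENCE = ["a", "b", "g", "f"]  # increasing order; f (weight 0) is greatest

def weight(term):
    symbol, *args = term
    return WEIGHT[symbol] + sum(weight(arg) for arg in args)

def kbo_greater(t, s):
    """Return True iff t is greater than s in the Knuth-Bendix order."""
    if weight(t) != weight(s):
        return weight(t) > weight(s)                            # condition 1
    if t[0] != s[0]:
        return PRECEDENCE.index(t[0]) > PRECEDENCE.index(s[0])  # condition 2
    for t_i, s_i in zip(t[1:], s[1:]):                          # condition 3
        if t_i != s_i:
            return kbo_greater(t_i, s_i)
    return False  # t and s are identical

a, b = ("a",), ("b",)
print(kbo_greater(("f", a), a))               # True: equal weights, f is above a
print(kbo_greater(("g", a, b), ("g", b, a)))  # False: first arguments differ, b is above a
```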

Note that the Knuth-Bendix order is a total monotonic well-founded order, see, e.g., [Baader and Nipkow 1998]. Some authors [Martin 1987, Baader and Nipkow 1998] define Knuth-Bendix orders with real-valued weight functions. We do not consider such orders here, because for real-valued functions even the comparison of ground terms can be undecidable (see Example 5.6 in Section 5).

The main result of this paper is the following.

Theorem 5.2: The existential first-order theory of any term algebra with the Knuth-Bendix order in a signature with at least two symbols is NP-complete.

To prove this result, we introduce a notion of Knuth-Bendix ordering constraint and show the following.

Theorem 5.1: For every Knuth-Bendix order, the problem of solving ordering constraints is contained in NP.

We also show that the systems of linear Diophantine equations and inequalities can be represented as ordering constraints for some Knuth-Bendix orders, and as a corollary we obtain the following.

Theorem 5.4: For some Knuth-Bendix orders, the problem of solving ordering constraints is NP-complete.

The proof of Theorem 5.2 will be given after a series of lemmas. The idea of the proof is as follows. First, we will make $\mathrm{TA}(\Sigma)$ into a two-sorted structure by adding the sort of natural numbers, and extend its signature by

1. the weight function $|\cdot|$ on ground terms;

2. the addition function $+$ on natural numbers;

3. the Knuth-Bendix order $\succ$ on ground terms.

Given an existential formula of the first-order theory of a term algebra with the Knuth-Bendix order, we will transform it step by step into an equivalent disjunction of existential formulas of the extended signature. The main aim of these steps is to replace all occurrences of $\succ$ by linear Diophantine inequalities on the weights of variables. After such a transformation we will obtain existential formulas consisting of linear Diophantine inequalities on the weights of variables plus statements expressing that, for some fixed natural number $N$, there exist at least $N$ terms of the same weight as $|x|$, where $x$ is a variable. We will show how these statements can be expressed using systems of linear Diophantine inequalities on the weights of variables and then use the fact that solvability of systems of linear Diophantine equations can be decided in NP.

We denote by $\mathrm{TA}^+(\Sigma)$ the following structure with two sorts: the term algebra sort and the arithmetical sort. The domains of the term algebra sort and the arithmetical sort are the sets of ground terms of $\Sigma$ and natural numbers, respectively. The signature of $\mathrm{TA}^+(\Sigma)$ consists of

1. all symbols of $\Sigma$ interpreted as in $\mathrm{TA}(\Sigma)$;

2. symbols $0$, $1$, $>$, $+$ having their conventional interpretation over natural numbers;

3. the binary relation symbol $\succ$ on the term algebra sort, interpreted as the Knuth-Bendix order;

4. the unary function symbol $|\cdot|$, interpreted as the weight function mapping terms to numbers.

When we need to distinguish the equality $=$ on the term algebra sort from the equality on the arithmetical sort, we denote the former by $=_{\mathrm{TA}}$ and the latter by $=_{\mathbb{N}}$.

We will prove that the existential theory of $\mathrm{TA}^+(\Sigma)$ is in NP, from which the fact that the existential theory of any term algebra with the Knuth-Bendix order belongs to NP follows immediately. We consider satisfiability, validity, and equivalence of formulas with respect to the structure $\mathrm{TA}^+(\Sigma)$. We call a constraint in the language of $\mathrm{TA}^+(\Sigma)$ any conjunction of atomic formulas of this language.

Lemma 2.5 The existential theory of $\mathrm{TA}^+(\Sigma)$ is in NP if and only if so is the constraint satisfiability problem.

Proof. Obviously any instance $A$ of the constraint satisfiability problem can be considered as validity of the existential sentence $\exists x_1 \ldots x_n A$, where $x_1, \ldots, x_n$ are all variables of $A$, so the "only if" direction is trivial.

To prove the "if" direction, take any existential formula $\exists x_1, \ldots, x_n A$. This formula is satisfiable if and only if so is the quantifier-free formula $A$. By converting $A$ into disjunctive normal form we can assume that $A$ is built from literals using $\land$, $\lor$. Replace in $A$

1. any formula $\lnot\, s \succ t$ by $s =_{\mathrm{TA}} t \lor t \succ s$,

2. any formula $\lnot\, s =_{\mathrm{TA}} t$ by $s \succ t \lor t \succ s$,

3. any formula $\lnot\, p > q$ by $p =_{\mathbb{N}} q \lor q > p$,

4. any formula $\lnot\, p =_{\mathbb{N}} q$ by $p > q \lor q > p$,

and convert $A$ into disjunctive normal form again. It is easy to see that we obtain a disjunction of constraints. The transformation gives an equivalent formula since both orders $\succ$ and $>$ are total.

It follows from these arguments that there exists a nondeterministic polynomial-time algorithm which, given an existential sentence $A$, computes on every branch a constraint $C_i$ such that $A$ is valid if and only if one of the constraints $C_i$ is satisfiable. □
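
The four replacement rules used in this proof can be tabulated directly. The sketch below is a minimal illustration; the encoding of atoms as triples and the relation names (`"kbo>"`, `"=TA"`, `"nat>"`, `"=N"`) are hypothetical labels introduced only for the example.

```python
# Atoms are triples (relation, left, right). Each entry maps a relation to
# the disjunction (a list of atoms) that replaces its negation; this is
# sound because both orders are total.
NEGATION = {
    "kbo>": lambda s, t: [("=TA", s, t), ("kbo>", t, s)],
    "=TA":  lambda s, t: [("kbo>", s, t), ("kbo>", t, s)],
    "nat>": lambda p, q: [("=N", p, q), ("nat>", q, p)],
    "=N":   lambda p, q: [("nat>", p, q), ("nat>", q, p)],
}

def eliminate_negation(atom):
    """Return the list of atoms whose disjunction is equivalent to 'not atom'."""
    relation, left, right = atom
    return NEGATION[relation](left, right)

print(eliminate_negation(("kbo>", "s", "t")))
# [('=TA', 's', 't'), ('kbo>', 't', 's')]
```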

A substitution is a mapping from the set of variables to the set of terms. A substitution $\theta$ is called grounding for an expression $C$ (i.e., term or constraint) if for every variable $x$ occurring in $C$ the term $\theta(x)$ is ground. Let $\theta$ be a substitution grounding for an expression $C$. We denote by $C\theta$ the expression obtained from $C$ by replacing in it every variable $x$ by $\theta(x)$. A substitution $\theta$ is called a solution to a constraint $C$ if $\theta$ is grounding for $C$ and $C\theta$ is valid in $\mathrm{TA}^+(\Sigma)$.

In the sequel we will often replace a constraint $C(\bar{x})$ by a formula $A(\bar{x}, \bar{y})$ containing extra variables $\bar{y}$ and say that they are "equivalent". By this we mean that $\mathrm{TA}^+(\Sigma) \models \forall \bar{x}\,(C(\bar{x}) \leftrightarrow \exists \bar{y}\, A(\bar{x}, \bar{y}))$. In other words, the set of solutions to $C$ is exactly the set of solutions to $A$ projected on $\bar{x}$.

3 Isolated forms 

We are interested not only in satisfiability of constraints, but also in their solutions. Our algorithm will consist of equivalence-preserving transformation steps. When the signature contains no unary function symbol of weight 0, the transformation will preserve equivalence in the following strong sense. At each step, given a constraint $C(\bar{x})$, we transform it into constraints $C_1(\bar{x}, \bar{y}), \ldots, C_n(\bar{x}, \bar{y})$ such that for every sequence of ground terms $\bar{t}$, the constraint $C(\bar{t})$ holds if and only if there exist $k$ and a sequence of ground terms $\bar{s}$ such that $C_k(\bar{t}, \bar{s})$ holds. In other words, the following formula holds in $\mathrm{TA}^+(\Sigma)$:

$$C(\bar{x}) \leftrightarrow \exists \bar{y}\,(C_1(\bar{x}, \bar{y}) \lor \ldots \lor C_n(\bar{x}, \bar{y})).$$

Moreover, these transformations will be presented as a nondeterministic polynomial-time algorithm which computes on every branch some $C_i(\bar{x}, \bar{y})$, and every $C_i(\bar{x}, \bar{y})$ is computed on at least one branch. When the signature contains a unary function symbol of weight 0, the transformation will preserve a weaker form of equivalence: some solutions will be lost, but solvability will be preserved. More precisely, we will introduce a notion of an $f$-variant of a term and show that the following formula holds:

$$C(\bar{x}) \leftrightarrow \exists \bar{y}\, \exists \bar{z}\,(f\text{-variant}(\bar{x}, \bar{z}) \land (C_1(\bar{z}, \bar{y}) \lor \ldots \lor C_n(\bar{z}, \bar{y}))), \qquad (1)$$

where $f\text{-variant}(\bar{x}, \bar{z})$ expresses that $\bar{x}$ and $\bar{z}$ are $f$-variants.

In our proof, we will reduce solvability of Knuth-Bendix ordering constraints to the problem of solvability of systems of linear Diophantine inequalities on the weights of variables. Condition 1 of the definition of the Knuth-Bendix order, $|t| > |s|$, has a simple translation into a linear Diophantine inequality, but conditions 2 and 3 do not. So we will split the Knuth-Bendix order into two partial orders: $\succ_w$ corresponding to condition 1 and $\succ_{lex}$ corresponding to conditions 2 and 3. Formally, we denote by $t \succ_w s$ the formula $|t| > |s|$ and by $t \succ_{lex} s$ the formula $|t| =_{\mathbb{N}} |s| \land t \succ s$. Obviously, $t_1 \succ t_2$ if and only if $t_1 \succ_{lex} t_2 \lor t_1 \succ_w t_2$. So in the sequel we will assume that $\succ$ is replaced by the new symbols $\succ_{lex}$ and $\succ_w$.

We use $x_1 \succ x_2 \succ \ldots \succ x_n$ to denote the formula $x_1 \succ x_2 \land x_2 \succ x_3 \land \ldots \land x_{n-1} \succ x_n$, and similarly for other binary symbols in place of $\succ$.

A term $t$ is called flat if $t$ is either a variable or has the form $g(x_1, \ldots, x_m)$, where $g \in \Sigma$, $m \ge 0$, and $x_1, \ldots, x_m$ are variables. We call a constraint chained if

1. it has the form $t_1 \# t_2 \# \cdots \# t_n$, where each occurrence of $\#$ is $\succ_w$, $\succ_{lex}$ or $=_{\mathrm{TA}}$;

2. each term $t_i$ is flat;

3. if some of the $t_i$'s has the form $g(x_1, \ldots, x_n)$, then $x_1, \ldots, x_n$ are some of the $t_j$'s.

Denote by $\bot$ the logical constant "false".

Lemma 3.1 Any constraint $C$ is equivalent to a disjunction $C_1 \lor \ldots \lor C_k$ of chained constraints. Moreover, there exists a nondeterministic polynomial-time algorithm which, for a given $C$, computes on every branch either $\bot$ or some $C_i$, and every $C_i$ is computed on at least one branch.

Proof. First, we can apply flattening to all terms occurring in $C$ as follows. If a nonflat term $g(t_1, \ldots, t_m)$ occurs in $C$, take any $i$ such that $t_i$ is not a variable. Then replace $C$ by $v =_{\mathrm{TA}} t_i \land C'$, where $v$ is a new variable and $C'$ is obtained from $C$ by replacing all occurrences of $t_i$ by $v$. After a finite number of such replacements all terms will become flat.
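
The flattening step can be sketched as follows. This is a simplified illustration, not the exact procedure of the proof: variables are encoded as strings and applications as tuples (both assumptions of the example), and a fresh variable is introduced for each nonvariable argument occurrence rather than shared between all occurrences of the same subterm.

```python
import itertools

def is_var(term):
    return isinstance(term, str)

def fresh_names():
    """Hypothetical supply of new variable names v1, v2, ..."""
    return (f"v{i}" for i in itertools.count(1))

def flatten(term, fresh, equations):
    """Return a flat term for `term`; append definitions (v, t) meaning v = t."""
    if is_var(term):
        return term
    symbol, *args = term
    flat_args = []
    for arg in args:
        if is_var(arg):
            flat_args.append(arg)
        else:
            v = next(fresh)  # new variable standing for the nonvariable argument
            equations.append((v, flatten(arg, fresh, equations)))
            flat_args.append(v)
    return (symbol, *flat_args)

eqs = []
print(flatten(("g", ("h", "x"), "y"), fresh_names(), eqs))  # ('g', 'v1', 'y')
print(eqs)                                                  # [('v1', ('h', 'x'))]
```

Every term returned or recorded in `eqs` is flat, so the original term is represented by the returned term together with the collected equations.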

Let $s, t$ be flat terms occurring in $C$ such that no comparison $s \# t$ occurs in $C$. Using the valid formula $s \succ_w t \lor s \succ_{lex} t \lor s =_{\mathrm{TA}} t \lor t \succ_w s \lor t \succ_{lex} s$ we can replace $C$ by the disjunction of the constraints

$s \succ_w t \land C$, $\quad s \succ_{lex} t \land C$, $\quad s =_{\mathrm{TA}} t \land C$, $\quad t \succ_w s \land C$, $\quad t \succ_{lex} s \land C$.

By repeatedly doing this transformation we obtain a disjunction of constraints $C_1 \lor \ldots \lor C_k$ in which for every pair of terms $s, t$ and every $i \in \{1, \ldots, k\}$ some comparison constraint $s \# t$ occurs in $C_i$.

To complete the proof we show how to turn each $C_i$ into a chained constraint. Let us call a cycle any constraint $s_1 \# s_2 \# \cdots \# s_n \# s_1$, where $n \ge 1$. We can remove all cycles from $C_i$ using the following observations:

1. if all $\#$ in the cycle are $=_{\mathrm{TA}}$, then $s_n \# s_1$ can be removed from the constraint;

2. if some $\#$ in the cycle is $\succ_w$ or $\succ_{lex}$, then the constraint $C_i$ is unsatisfiable.

After removal of all cycles the constraint $C_i$ may still fail to be chained because it can contain transitive subconstraints of the form $s_1 \# s_2 \# \cdots \# s_n \land s_1 \# s_n$, $n > 2$. Then either $C_i$ is unsatisfiable or $s_1 \# s_n$ can be removed using the following observations:

1. Case: $s_1 \# s_n$ is $s_1 \succ_w s_n$. If some $\#$ in $s_1 \# s_2 \# \cdots \# s_n$ is $\succ_w$, then $s_1 \succ_w s_n$ follows from $s_1 \# s_2 \# \cdots \# s_n$; otherwise $s_1 \# s_2 \# \cdots \# s_n$ implies $|s_1| = |s_n|$ and hence $C_i$ is unsatisfiable.

2. Case: $s_1 \# s_n$ is $s_1 \succ_{lex} s_n$. If some $\#$ in $s_1 \# s_2 \# \cdots \# s_n$ is $\succ_w$, then $C_i$ is unsatisfiable. If all $\#$ in $s_1 \# s_2 \# \cdots \# s_n$ are $=_{\mathrm{TA}}$, then $C_i$ is unsatisfiable too. Otherwise, all $\#$ in $s_1 \# s_2 \# \cdots \# s_n$ are either $\succ_{lex}$ or $=_{\mathrm{TA}}$, and at least one of them is $\succ_{lex}$. It is not hard to argue that $s_1 \succ_{lex} s_n$ follows from $s_1 \# s_2 \# \cdots \# s_n$.

3. Case: $s_1 \# s_n$ is $s_1 =_{\mathrm{TA}} s_n$. If all $\#$ in $s_1 \# s_2 \# \cdots \# s_n$ are $=_{\mathrm{TA}}$, then $s_1 =_{\mathrm{TA}} s_n$ follows from $s_1 \# s_2 \# \cdots \# s_n$; otherwise $C_i$ is unsatisfiable.

It is easy to see that after the removal of all cycles and transitive subconstraints the constraint $C_i$ becomes chained.

Note that the transformation of $C$ into the disjunction of constraints $C_1 \lor \ldots \lor C_k$ in the proof can be done in nondeterministic polynomial time in the following sense: there exists a nondeterministic polynomial-time algorithm which, given $C$, computes on every branch either $\bot$ or some $C_i$, and every $C_i$ is computed on at least one branch. □

We will now introduce several special kinds of constraints which will be used in our proofs 
below, namely arithmetical, triangle, simple, and isolated. 

A constraint is called arithmetical if it uses only arithmetical relations $=_{\mathbb{N}}$ and $>$, for example $|x| > |a| + 3$.

A constraint $y_1 =_{\mathrm{TA}} t_1 \land \ldots \land y_n =_{\mathrm{TA}} t_n$ is said to be in triangle form if

1. $y_1, \ldots, y_n$ are pairwise different variables, and

2. for all $j \ge i$ the variable $y_i$ does not occur in $t_j$.

The variables $y_1, \ldots, y_n$ are said to be dependent in this constraint.
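
The two conditions of triangle form are easy to check mechanically. The following sketch assumes that variables are encoded as strings, applications as tuples, and a constraint as a list of pairs $(y_i, t_i)$; it also assumes that the index condition reads $j \ge i$, so that $y_i$ may not occur in its own right-hand side.

```python
def is_var(term):
    return isinstance(term, str)

def variables(term):
    """Return the set of variables occurring in a term."""
    if is_var(term):
        return {term}
    return set().union(*(variables(arg) for arg in term[1:]))

def is_triangle(equations):
    """Check triangle form for a list of pairs (y_i, t_i) meaning y_i = t_i."""
    lhs = [y for y, _ in equations]
    # Condition 1: the left-hand sides are pairwise different variables.
    if not all(is_var(y) for y in lhs) or len(set(lhs)) != len(lhs):
        return False
    # Condition 2 (assumed reading): y_i does not occur in t_j for all j >= i.
    return all(lhs[i] not in variables(equations[j][1])
               for i in range(len(lhs)) for j in range(i, len(lhs)))

print(is_triangle([("y1", ("g", "y2", "x")), ("y2", ("a",))]))  # True
print(is_triangle([("y2", ("a",)), ("y1", ("g", "y2", "x"))]))  # False
```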
A constraint is said to be simple if it has the form

$$x_{11} \succ_{lex} x_{12} \succ_{lex} \cdots \succ_{lex} x_{1n_1} \land \ldots \land x_{k1} \succ_{lex} x_{k2} \succ_{lex} \cdots \succ_{lex} x_{kn_k},$$

where $x_{11}, \ldots, x_{kn_k}$ are pairwise different variables.

A constraint is said to be in isolated form if either it is $\bot$ or it has the form

$$C_{arith} \land C_{triang} \land C_{simp},$$

where $C_{arith}$ is an arithmetical constraint, $C_{triang}$ is in triangle form, and $C_{simp}$ is a simple constraint such that no variable of $C_{simp}$ is dependent in $C_{triang}$.

Our decision procedure for the Knuth-Bendix ordering constraints is designed as follows. By Lemma 3.1 we can transform any constraint into an equivalent disjunction of chained constraints. Our next step is to give a transformation of any chained constraint into an equivalent disjunction of constraints in isolated form. Then in Section 4 we show how to transform any constraint in isolated form into an equivalent disjunction of systems of linear Diophantine inequalities on the weights of variables. Then we can use the result that solvability of systems of linear Diophantine inequalities can be decided in NP.

Let us show how to transform any chained constraint into an equivalent disjunction of isolated forms. The transformation will work on the constraints of the form

$$C_{chain} \land C_{arith} \land C_{triang} \land C_{simp}, \qquad (2)$$

such that

1. $C_{arith}$, $C_{triang}$, $C_{simp}$ are as in the definition of isolated form;

2. $C_{chain}$ is a chained constraint;

3. each variable of $C_{chain}$ neither occurs in $C_{simp}$ nor is dependent in $C_{triang}$.

We will call such constraints (2) working. Let us call the size of a chained constraint $C$ the total number of occurrences of function symbols and variables in $C$. Likewise, the essential size of a working constraint is the size of its chained part $C_{chain}$.

At each transformation step we will replace working constraint (2) by a disjunction of working constraints of smaller essential sizes. Evidently, when the essential size is 0, we obtain a constraint in isolated form.

Let us prove some lemmas about solutions to constraints of the form (2). Note that any chained constraint is of the form

$$t_{11} \# t_{12} \# \cdots \# t_{1m_1} \;\succ_w\; \cdots \;\succ_w\; t_{k1} \# t_{k2} \# \cdots \# t_{km_k}, \qquad (3)$$

where each $\#$ is either $=_{\mathrm{TA}}$ or $\succ_{lex}$ and each $t_{ij}$ is a flat term. We call a row in such a constraint any maximal subsequence $t_{i1} \# t_{i2} \# \cdots \# t_{im_i}$ in which $\succ_w$ does not occur. So constraint (3) contains $k$ rows, the first one is $t_{11} \# t_{12} \# \cdots \# t_{1m_1}$ and the last one $t_{k1} \# t_{k2} \# \cdots \# t_{km_k}$. Note that for any solution to (3) all terms in a row have the same weight.

Lemma 3.2 There exists a polynomial-time algorithm which transforms any chained constraint $C$ into an equivalent chained constraint $C'$ such that the size of $C'$ is not greater than the size of $C$, $C'$ is either $\bot$ or of the form (3), and $C'$ has the following property. Suppose some term $t_{1j}$ of the first row of $C'$ is a variable $y$. Then either

1. $y$ has exactly one occurrence in $C'$, namely $t_{1j}$ itself; or

2. $y$ has exactly two occurrences in $C'$, both in the first row: some $t_{1n}$ has the form $f(y)$ for $n < j$, and $w(f) = 0$; moreover, in this case there exists at least one $\succ_{lex}$ between $t_{1n}$ and $t_{1j}$.

Proof. Note that if $y$ occurs in any term $t(y)$ which is not in the first row, then $C$ is unsatisfiable, since for any solution $\theta$ to $C$ we have $|y\theta| > |t(y)\theta|$, which is impossible. Suppose that $y$ has another occurrence in a term $t_{1n}$ of the first row. Consider two cases.

1. $t_{1n}$ coincides with $y$. Then either $C$ has no solution, or the part of the first row between $t_{1n}$ and $t_{1j}$ has the form $y =_{\mathrm{TA}} \cdots =_{\mathrm{TA}} y$. In the latter case the part $y =_{\mathrm{TA}} \cdots$ can be removed from the first row, so we can assume that no term in the first row except $t_{1j}$ is $y$.

2. $t_{1n}$ is a nonvariable term containing $y$. Since $t_{1n}$ and $y$ are in the same row, for every solution $\theta$ to $C$ we have $|y\theta| = |t_{1n}\theta|$. Since $t_{1n}$ is a flat term, by Lemma 2.2 the equality $|y\theta| = |t_{1n}\theta|$ is possible only if $t_{1n}$ is $f(y)$ and $n < j$. Finally, if $f(y)$ has more than one occurrence in the first row, we can get rid of all of them but one in the same way as we got rid of multiple occurrences of $y$.

Note that the transformation presented in this proof can be made in polynomial time. It is also not hard to argue that the transformation does not increase the size of the constraint. □

We will now take a working constraint $C_{chain} \land C_{arith} \land C_{triang} \land C_{simp}$ whose chained part satisfies Lemma 3.2 and transform it into an equivalent disjunction of working constraints of smaller essential sizes in Lemma 3.5 below. More precisely, these constraints will be equivalent when the signature contains no unary function symbol of weight 0. When the signature contains such a symbol $f$, a weaker notion of equivalence will hold, see formula (1).

A term $s$ is called an $f$-variant of a term $t$ if $s$ can be obtained from $t$ by a sequence of operations of the following forms: replacement of a subterm $f(r)$ by $r$ or replacement of a subterm $r$ by $f(r)$. Evidently, $f$-variant is an equivalence relation. Two substitutions $\theta_1$ and $\theta_2$ are said to be $f$-variants if for every variable $x$ the term $x\theta_1$ is an $f$-variant of $x\theta_2$. In the proof of several lemmas below we will replace a constraint $C(\bar{x})$ by a formula $A(\bar{x}, \bar{y})$ containing extra variables $\bar{y}$ and say that $C(\bar{x})$ and $A(\bar{x}, \bar{y})$ are equivalent up to $f$. By this we mean the following.

1. For every substitution $\theta_1$ grounding for $\bar{x}$ such that $\mathrm{TA}^+(\Sigma) \models C(\bar{x})\theta_1$, there exists a substitution $\theta_2$ grounding for $\bar{x}, \bar{y}$ such that $\mathrm{TA}^+(\Sigma) \models A(\bar{x}, \bar{y})\theta_2$, and the restriction of $\theta_2$ to $\bar{x}$ is an $f$-variant of $\theta_1$.

2. For every substitution $\theta_2$ grounding for $\bar{x}, \bar{y}$ such that $\mathrm{TA}^+(\Sigma) \models A(\bar{x}, \bar{y})\theta_2$, there exists a substitution $\theta_1$ such that $\mathrm{TA}^+(\Sigma) \models C(\bar{x})\theta_1$ and $\theta_1$ is an $f$-variant of the restriction of $\theta_2$ to $\bar{x}$. In other words, formula (1) holds.

Note that when the signature contains no unary function symbol of weight 0, equivalence up to $f$ is the same as equality of terms in $\mathrm{TA}^+(\Sigma)$.
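
Since the two operations only insert or delete occurrences of $f$, two ground terms are $f$-variants exactly when they coincide after every occurrence of $f$ has been erased. The sketch below uses this observation; the tuple encoding of terms and the symbol name `"f"` are assumptions of the example.

```python
def strip_f(term, f="f"):
    """Erase every occurrence of the unary symbol f from a ground term."""
    symbol, *args = term
    if symbol == f:
        return strip_f(args[0], f)
    return (symbol, *(strip_f(arg, f) for arg in args))

def is_f_variant(s, t, f="f"):
    """Two ground terms are f-variants iff they agree once all f's are erased."""
    return strip_f(s, f) == strip_f(t, f)

a, b = ("a",), ("b",)
print(is_f_variant(("f", ("g", a, ("f", b))), ("g", ("f", a), b)))  # True
print(is_f_variant(a, b))                                           # False
```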

Lemma 3.3 Let $C = C_{chain} \land C_{arith} \land C_{triang} \land C_{simp}$ be a working constraint and $\theta_1$ be a solution to $C$. Let $\theta_2$ be an $f$-variant of $\theta_1$ such that

1. $\theta_2$ is a solution to $C_{chain}$ and

2. $\theta_2$ coincides with $\theta_1$ on all variables not occurring in $C_{chain}$.

Then there exists an $f$-variant $\theta_3$ of $\theta_2$ such that

1. $\theta_3$ is a solution to $C$ and

2. $\theta_3$ coincides with $\theta_2$ on all variables except for the dependent variables of $C_{triang}$.

Proof. Let us first prove that $\theta_2$ is a solution to both $C_{arith}$ and $C_{simp}$. Since $C_{simp}$ and $C_{chain}$ have no common variables, it follows that $\theta_1$ and $\theta_2$ agree on all variables of $C_{simp}$, and so $\theta_2$ is a solution to $C_{simp}$. Since $\theta_1$ and $\theta_2$ are $f$-variants and the weight of $f$ is 0, for every term $t$ we have $|t\theta_1| = |t\theta_2|$ whenever $t\theta_1$ is ground. Therefore, $\theta_2$ is a solution to $C_{arith}$ if and only if so is $\theta_1$. So $\theta_2$ is a solution to $C_{arith}$.

It is fairly easy to see that $\theta_2$ can be changed on the dependent variables of $C_{triang}$ obtaining a solution $\theta_3$ to $C$ which satisfies the conditions of the lemma. □

This lemma will be used below in the following way. Instead of considering the set $\Theta_1$ of all solutions to $C_{chain}$ we can restrict ourselves to a subset $\Theta_2$ of $\Theta_1$ as soon as for every solution $\theta_1 \in \Theta_1$ there exists a solution $\theta_2 \in \Theta_2$ such that $\theta_2$ is an $f$-variant of $\theta_1$.

Let us call an $f$-term any term of the form $f(t)$. By the $f$-height of a term $t$ we mean the number $n$ such that $t = f^n(s)$ and $s$ is not an $f$-term. Note that the $f$-terms are exactly the terms of a positive $f$-height. We call the $f$-distance between two terms $s$ and $t$ the difference between the $f$-height of $s$ and the $f$-height of $t$. For example, the $f$-distance between the terms $f(a)$ and $f(f(g(a, b)))$ is $-1$.
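
The two notions just defined can be computed as in the following sketch, which again assumes ground terms encoded as nested tuples with the weight-0 symbol named `"f"`. It reproduces the example above.

```python
def f_height(term, f="f"):
    """Return n such that term = f^n(s) and s is not an f-term."""
    n = 0
    while term[0] == f:
        term = term[1]  # descend into the single argument of f
        n += 1
    return n

def f_distance(s, t, f="f"):
    """f-height of s minus f-height of t (may be negative)."""
    return f_height(s, f) - f_height(t, f)

a, b = ("a",), ("b",)
print(f_distance(("f", a), ("f", ("f", ("g", a, b)))))  # 1 - 2 = -1
```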

Let us now prove a lemma which implies that any solution to $C$ can be transformed into a solution with a "small" $f$-height.

Lemma 3.4 Let $C_{chain}$ be a chained constraint of the form

$$p_l \# p_{l-1} \# \cdots \# p_1 \succ_w \ldots,$$

where each $\#$ is either $=_{\mathrm{TA}}$ or $\succ_{lex}$. Further, let $C_{chain}$ satisfy the conditions of Lemma 3.2 and $\theta$ be a solution to $C_{chain}$. Then there exists an $f$-variant $\theta'$ of $\theta$ such that

1. $\theta'$ is a solution to $C_{chain}$ and

2. for every $k \in \{1, \ldots, l\}$, the $f$-height of $p_k\theta'$ is at most $k$.

Proof. Let us first prove the following statement.

(4) The row $p_l \# \cdots \# p_1$ has a solution $\theta_1$ such that (i) $\theta_1$ is an $f$-variant of $\theta$, (ii) for every $1 < k \le l$ the $f$-distance between $p_k\theta_1$ and $p_{k-1}\theta_1$ is at most 1.

Suppose that for some $k$ the $f$-distance between $p_k\theta$ and $p_{k-1}\theta$ is $d > 1$. Evidently, to prove (4) it is enough to show the following.

(5) There exists a solution $\theta_2$ such that (i) $\theta_2$ is an $f$-variant of $\theta$, (ii) the $f$-distance between $p_k\theta_2$ and $p_{k-1}\theta_2$ is $d - 1$, and (iii) for every $k' \ne k$ the $f$-distance between $p_{k'}\theta_2$ and $p_{k'-1}\theta_2$ coincides with the $f$-distance between $p_{k'}\theta$ and $p_{k'-1}\theta$.

Let us show (5), and hence (4). Since $\theta$ is a solution to the row, for every $k''' \ge k$ the $f$-distance between $p_{k'''}\theta$ and $p_k\theta$ is nonnegative. Likewise, for every $k'' \le k - 1$ the $f$-distance between $p_{k-1}\theta$ and $p_{k''}\theta$ is nonnegative. Therefore, for all $k''' \ge k > k''$, the $f$-distance between $p_{k'''}\theta$ and $p_{k''}\theta$ is $\ge d$, and hence is at least 2. Let us prove the following.

(6) Every variable $x$ occurring in $p_l \# p_{l-1} \# \cdots \# p_k$ does not occur in $p_{k-1} \# \cdots \# p_1$.

Let $x$ occur in terms $p_i$ and $p_j$ such that $l \ge i \ge k$ and $k - 1 \ge j \ge 1$. Since the constraint satisfies Lemma 3.2, $p_i = f(x)$ and $p_j = x$. Then the $f$-distance between $p_i\theta$ and $p_j\theta$ is 1, but by our assumption it is at least 2, so we obtain a contradiction. Hence (6) is proved.

Now note the following.

(7) If for some $k''' \ge k$ a variable $x$ occurs in $p_{k'''}$, then $x\theta$ is an $f$-term.

Suppose, by contradiction, that $x\theta$ is not an $f$-term. Note that $p_{k'''}\theta$ has a positive $f$-height, so $p_{k'''}$ is either $x$ or $f(x)$. But we proved before that the $f$-distance between $p_{k'''}\theta$ and $p_{k-1}\theta$ is at least 2, so $x\theta$ must be an $f$-term.

Now, to satisfy (5), define the substitution $\theta_2$ as follows:

$$\theta_2(x) = \begin{cases} \theta(x), & \text{if } x \text{ does not occur in } p_l, \ldots, p_k; \\ t, & \text{if } x \text{ occurs in } p_l, \ldots, p_k \text{ and } \theta(x) = f(t). \end{cases}$$

By (6) and (7), $\theta_2$ is defined correctly. We claim that $\theta_2$ satisfies (5). The properties (i)-(iii) of (5) are straightforward by our construction; it only remains to prove that $\theta_2$ is a solution to the row, i.e., for every $k'$ we have $p_{k'}\theta_2 \# p_{k'-1}\theta_2$. For $k' > k$ we have $p_{k'}\theta = f(p_{k'}\theta_2)$ and $p_{k'-1}\theta = f(p_{k'-1}\theta_2)$, and for $k' < k$ we have $p_{k'}\theta = p_{k'}\theta_2$ and $p_{k'-1}\theta = p_{k'-1}\theta_2$; in both cases $p_{k'}\theta_2 \# p_{k'-1}\theta_2$ follows from $p_{k'}\theta \# p_{k'-1}\theta$. The only difficult case is $k = k'$.

Assume $k = k'$. Since the $f$-distance between $p_k\theta$ and $p_{k-1}\theta$ is $d > 1$, we have $p_k\theta \ne p_{k-1}\theta$, and hence $p_k \# p_{k-1}$ must be $p_k \succ_{lex} p_{k-1}$. Since $\theta$ is a solution to $p_k \succ_{lex} p_{k-1}$ and since $\theta_2$ is an $f$-variant of $\theta$, the weights of $p_k\theta_2$ and $p_{k-1}\theta_2$ coincide. But then $p_k\theta_2 \succ_{lex} p_{k-1}\theta_2$ follows from the fact that the $f$-distance between $p_k\theta_2$ and $p_{k-1}\theta_2$ is $d - 1 \ge 1$.

Now the proof of (5), and hence of (4), is completed. In the same way as (4), we can also prove

(8) The constraint $C_{chain}$ has a solution $\theta'$ such that (i) $\theta'$ is an $f$-variant of $\theta$, (ii) for every $1 < k \le l$ the $f$-distance between $p_k\theta'$ and $p_{k-1}\theta'$ is at most 1, (iii) the $f$-height of $p_1\theta'$ is at most 1, (iv) $\theta'$ and $\theta$ coincide on all variables occurring in the rows below the first one.

It is not hard to derive Lemma 3.4 from (8). □



The following lemma is the main (and the last) lemma of this section.

Lemma 3.5 Let $C = C_{chain} \land C_{arith} \land C_{triang} \land C_{simp}$ be a working constraint in which $C_{chain}$ is nonempty. There exists a nondeterministic polynomial-time algorithm which transforms $C$ into a disjunction of working constraints having $C_{chain}$ of smaller sizes and equivalent to $C$ up to $f$.

Proof. The proof is rather complex, so we will give a plan of it. The proof is presented as a series of transformations on the first row of $C_{chain}$. These transformations may result in new constraints added to $C_{arith}$, $C_{triang}$, and $C_{simp}$. First, we will get rid of equations $s =_{\mathrm{TA}} t$ in the first row, by introducing quasi-flat terms, i.e., terms $f^k(t)$, where $t$ is flat. If the first row contains no function symbols, then we will replace the first row by new constraints added to $C_{simp}$ and $C_{arith}$, thus decreasing the size of the chained part. If there are function symbols in the first row, we will continue as follows.

We will "guess" the values of some variables $x$ of the first row, i.e., replace them by some quasi-flat term $f^k(g(\bar{y}))$, where $\bar{y}$ is a sequence of new variables. After these steps, the size of the first row can, in general, increase. Then we will show how to replace the first row by new constraints involving only variables occurring in the row, but not function symbols. Finally, we will prove that the number of variables from the new constraints that remain in the chained part is not greater than the original number of variables in the first row, and therefore the size of the chained part decreases.

Formally, consider the first row of $C_{chain}$. Let this row be $p_l \# p_{l-1} \# \cdots \# p_1$. Then $C_{chain}$ has the form $p_l \# p_{l-1} \# \cdots \# p_1 \succ_w t_1 \# \cdots \# t_n$. If $l = 1$, i.e., the first row consists of one term, we can remove this row and add $|p_1| > |t_1|$ to $C_{arith}$ obtaining an equivalent constraint with smaller essential size, that is, the size of $C_{chain}$. So we assume that the first row contains at least two terms.

As before, we assume that $f$ is a unary function symbol of weight 0. By Lemma 3.4, if some $p_i$ is either a variable $x$ or a term $f(x)$, it is enough to search for solutions $\theta$ such that the $f$-height of $x\theta$ is at most $l$.

A term is called quasi-flat if it has the form $f^k(t)$ where $t$ is flat. We will now get rid of equalities in the first row, but by introducing quasi-flat terms instead of the flat ones. When we use the notation $f^k(t)$ below, we assume $k \ge 0$, and $f^0(t)$ will stand for $t$. We eliminate equalities from the first row in two steps. First we will eliminate equalities among variables and $f$-terms, transforming them into an equivalent set of equalities in triangle form; then we eliminate all other equalities in the first row.

Consider the set $S$ of all equalities $t =_{TA} s$ occurring in the first row of $C_{chain}$,
where $s$ and $t$ are either variables or flat $f$-terms. We will transform $S$ into an equivalent
system $F$ in triangle form such that all terms in $F$ will be flat. We assume that before the
transformation $F$ is empty. First we replace all equalities in $S$ of the form
$f(x) =_{TA} f(y)$ by $x =_{TA} y$, obtaining an equivalent system $S'$ in which all equalities are
of the form $x =_{TA} t$. Now, either $S'$ is unsatisfiable or there exists an equality
$x =_{TA} t$ in $S'$ such that $x$ does not occur in $f$-terms of $S'$. We move such an equality
$x =_{TA} t$ into $F$ and replace all occurrences of $x$ in $S'$ by $t$, obtaining $S''$. It is
easy to see that the system $F \cup S''$ is equivalent to $S$, all terms in $F \cup S''$ are flat,
$F$ is in triangle form and the number of variables occurring in $S''$ is less than the number of
variables occurring in $S$. Repeating this process we can eliminate all variables from $S$ and
obtain the required $F$ in polynomial time.
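
The elimination loop just described can be sketched in Python. This is only an illustration under
stated assumptions, not the authors' implementation: variables are modelled as strings, a flat
$f$-term $f(x)$ as the tuple `("f", "x")`, and the helper names (`normalise`, `triangle_form`) are
hypothetical. The function returns the list $F$ of extracted equalities, or `None` when no
equality is eligible, which is the situation the text identifies with unsatisfiability.

```python
def is_var(t):
    return isinstance(t, str)

def normalise(eqs):
    """Strip f(x) = f(y) to x = y, drop trivial equalities, orient as var = term."""
    out = []
    for a, b in eqs:
        if not is_var(a) and not is_var(b):
            a, b = a[1], b[1]          # f(x) = f(y) becomes x = y
        if a == b:
            continue                    # x = x carries no information
        if not is_var(a):
            a, b = b, a                 # keep a variable on the left
        out.append((a, b))
    return out

def triangle_form(eqs):
    S, F = normalise(eqs), []
    while S:
        # variables that occur under f somewhere in S
        under_f = {t[1] for e in S for t in e if not is_var(t)}
        pick = None
        for a, b in S:
            if a not in under_f:
                pick = (a, b)
                break
            if is_var(b) and b not in under_f:
                pick = (b, a)           # a var = var equality may be read either way
                break
        if pick is None:
            return None                 # no eligible equality: reported as unsatisfiable
        x, t = pick
        F.append((x, t))
        # x occurs only as a bare variable, so a top-level replacement suffices
        S = normalise([(t if a == x else a, t if b == x else b) for a, b in S])
    return F
```

Each pass removes one variable from $S$, so the loop runs at most as many times as there are
variables, matching the polynomial bound claimed above.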

Now we remove from $C_{chain}$ all equalities occurring in $S$. Let us note that variables of $F$
can occur in $C_{chain}$ only in the first row, and only in the terms $f^r(y)$ for
$0 \le r \le 1$. Next we repeatedly replace all occurrences of dependent variables of $F$
occurring in $C_{chain}$, obtaining an equivalent constraint in chained form with terms of the
form $f^k(x)$ where $k$ is bounded by the size of $F$. Finally we move $F$ into $C_{triang}$.

After all these transformations we can assume that equalities $f^k(x) =_{TA} f^m(y)$ do not occur
in the first row.

If the first row contains an equality $x =_{TA} t$ between a variable and a term, we replace this
equality by $t$, replace all occurrences of $x$ by $t$ in the first row, and add $x =_{TA} t$ to
$C_{triang}$, obtaining an equivalent working constraint. Since $x$ can occur only in the terms of
the form $f^k(x)$, it is easy to see that these replacements can be done in polynomial time.

If the first row contains an equality $g(x_1, \ldots, x_m) =_{TA} h(t_1, \ldots, t_n)$ where $g$
and $h$ are different function symbols, the constraint is unsatisfiable.

If the first row contains an equality $g(x_1, \ldots, x_n) =_{TA} g(y_1, \ldots, y_n)$ we do the
following. If the term $g(x_1, \ldots, x_n)$ coincides with $g(y_1, \ldots, y_n)$, replace this
equality by $g(x_1, \ldots, x_n)$. Otherwise, find the smallest number $i$ such that $x_i$ is
different from $y_i$ and

1. add $y_i =_{TA} x_i$ to $C_{triang}$;

2. replace all occurrences of $y_i$ in $C_{chain}$ by $x_i$.

We apply this transformation repeatedly until all equalities
$g(x_1, \ldots, x_n) =_{TA} g(y_1, \ldots, y_n)$ disappear from the first row.

So we can now assume that the first row contains no equalities and hence it has the form
$q_n \succ_{lex} q_{n-1} \succ_{lex} \ldots \succ_{lex} q_1$, where all of the terms $q_i$ are
quasi-flat.

If all of the $q_i$ are variables, we can move
$q_n \succ_{lex} q_{n-1} \succ_{lex} \ldots \succ_{lex} q_1$ to $C_{simp}$ and add
$|q_1| > |t_1|$ to $C_{arith}$, obtaining an equivalent working constraint of smaller essential
size. Hence, we can assume that at least one of the $q_i$ is a nonvariable term.

Take any term $q_k$ in the first row such that $q_k$ is either a variable $x$ or a term $f^r(x)$.
Note that other occurrences of $x$ in $C_{chain}$ can only be in the first row, and only in the
terms of the form $f^r(x)$.

Consider the formula $G$ defined as

$$\bigvee_{g \in \Sigma - \{f\}} \; \bigvee_{m = 0 \ldots l} x =_{TA} f^m(g(\bar{y})), \tag{9}$$

where $\bar{y}$ is a sequence of pairwise different new variables. Since we proved that it is
enough to restrict ourselves to solutions $\theta$ for which the height of $x\theta$ is at most
$l$, the formulas $C$ and $C \wedge G$ are equivalent up to $f$.

Using the distributivity laws, $C \wedge G$ can be turned into an equivalent disjunction of
formulas $x =_{TA} f^m(g(\bar{y})) \wedge C$. For every such formula, replace $x$ by
$f^m(g(\bar{y}))$ in the first row, and add $x =_{TA} f^m(g(\bar{y}))$ to the triangle part. We do
this transformation for all terms in the first row of the form $f^k(z)$, where $k \ge 0$ and $z$
is a variable. Now all the terms in the first row are of the form $f^m(g(\bar{y}))$, where $g$ is
different from $f$ and $m \ge 0$.

Let us show how to replace constraints of the first row with equivalent constraints consisting
of constraints on variables and arithmetical constraints. Consider the pair $q_n, q_{n-1}$. Now
$q_n = f^k(g(x_1, \ldots, x_u))$ and $q_{n-1} = f^m(h(y_1, \ldots, y_v))$ for some variables
$x_1, \ldots, x_u, y_1, \ldots, y_v$ and function symbols $g, h \in \Sigma - \{f\}$. Then
$q_n \succ_{lex} q_{n-1}$ is
$f^k(g(x_1, \ldots, x_u)) \succ_{lex} f^m(h(y_1, \ldots, y_v))$.
If $k < m$ or ($k = m$ and $h \gg g$), then
$f^k(g(x_1, \ldots, x_u)) \succ_{lex} f^m(h(y_1, \ldots, y_v))$ is equivalent to $\bot$.
If $k > m$ or ($k = m$ and $g \gg h$), then
$f^k(g(x_1, \ldots, x_u)) \succ_{lex} f^m(h(y_1, \ldots, y_v))$ is equivalent to the arithmetical
constraint $|g(x_1, \ldots, x_u)| =_N |h(y_1, \ldots, y_v)|$ which can be added to $C_{arith}$.
If $k = m$ and $g = h$ (and hence $u = v$), then

$$f^k(g(x_1, \ldots, x_u)) \succ_{lex} f^m(h(y_1, \ldots, y_v)) \;\leftrightarrow\; |g(x_1, \ldots, x_u)| =_N |h(y_1, \ldots, y_v)| \;\wedge \bigvee_{i = 1 \ldots u} (x_1 =_{TA} y_1 \wedge \ldots \wedge x_{i-1} =_{TA} y_{i-1} \wedge x_i \succ y_i).$$

We can now do the following. Add $|g(x_1, \ldots, x_u)| =_N |h(y_1, \ldots, y_v)|$ to
$C_{arith}$ and replace $q_n \succ_{lex} q_{n-1}$ with the equivalent disjunction

$$\bigvee_{i = 1 \ldots u} (x_1 =_{TA} y_1 \wedge \ldots \wedge x_{i-1} =_{TA} y_{i-1} \wedge x_i \succ y_i).$$

Then using the distributivity laws turn this formula into the equivalent disjunction of
constraints of the form
$C \wedge x_1 =_{TA} y_1 \wedge \ldots \wedge x_{i-1} =_{TA} y_{i-1} \wedge x_i \succ y_i$ for all
$i = 1 \ldots u$. For each of these constraints, we can move, as before, the equalities
$x =_{TA} y$ one by one to the triangle part $C_{triang}$, and make
$C_{chain} \wedge x_i \succ y_i$ into a disjunction of chained constraints as in Lemma 3.1.



Let us analyze what we have achieved. After these transformations, in each member of the
obtained disjunction the first row is removed from the chained part $C_{chain}$ of $C$. Since the
row contained at least one function symbol, each member of the disjunction will contain at least
one occurrence of a function symbol less than the original constraint. This is enough to prove
termination of our algorithm, but not enough to present it as a nondeterministic polynomial-time
algorithm. The problem is that, when $p_n$ is a variable $x$ or a term $f(x)$, one occurrence
of $x$ in $p_n$ can be replaced by one or more constraints of the form $x_i \succ y_i$, where
$x_i$ and $y_i$ are new variables. To be able to show that the essential size of each of the
resulting constraints is strictly less than the essential size of the original constraint, we have
to modify our algorithm slightly.

The modification will guarantee that the number of new variables introduced in the chained
part of the constraint is not greater than the number of variables eliminated from the first row.
We will achieve this by moving some constraints to the simple part $C_{simp}$.



The new variables only appear when we replace a variable in the first row by a term
$f^k(h(u_1, \ldots, u_m))$ or by $f^k(h(v_1, \ldots, v_m))$, obtaining a constraint
$f^k(h(u_1, \ldots, u_m)) \succ_{lex} f^k(h(v_1, \ldots, v_m))$, which is then replaced by

$$u_1 =_{TA} v_1 \wedge \ldots \wedge u_{i-1} =_{TA} v_{i-1} \wedge u_i \succ v_i. \tag{10}$$

Let us call a variable $u_i$ (respectively, $v_i$) new if $f^k(h(u_1, \ldots, u_m))$
(respectively, $f^k(h(v_1, \ldots, v_m))$) occurred in the terms of the first row when we replaced
a variable by a nonvariable term containing $h$ using formula (9). In other words, new variables
are those that did not occur in the terms of the first row before our transformation, but appeared
in the terms of the first row during the transformation. All other variables are called old. After
the transformation we obtain a conjunction $E$ of constraints of the form $x_i =_{TA} x_j$ or
$x_i \succ x_j$, where $x_i, x_j$ can be either new or old. Without loss of generality we can
assume that this conjunction of constraints does not contain chains of the form
$x_1 \# \ldots \# x_n \# x_1$ where $n \ge 2$ and at least one of the $\#$'s is $\succ$. Indeed,
if $E$ contains such a chain, then it is unsatisfiable.

We will now show that the number of new variables can be restricted by moving constraints
on these variables into the triangle or simple part. Among the new variables, let us distinguish
the following three kinds of variables. A new variable $x$ is called blue in $E$ if $E$ contains a
chain $x =_{TA} x_1 =_{TA} \ldots =_{TA} x_n$, where $x_n$ is an old variable. Evidently, a blue
variable $x$ causes no harm since it can be replaced by an old variable $x_n$. Let us denote by
$\prec$ the inverse relation to $\succ$. A new variable $x$ is called red in $E$ if it is not blue
in $E$ and $E$ contains a chain $x \# x_1 \# \ldots \# x_n$ where $x_n$ is an old variable, and
all of the $\#$'s are either $=_{TA}$, or $\succ$, or $\prec$.

Red variables are troublesome, since there is no obvious way to get rid of them. However, we
will show that the number of red variables is not greater than the number of replaced variables
(such as the variable $x$ in (9)). Finally, all new variables that are neither blue nor red in $E$
are called green in $E$.

Getting rid of the green variables. We will now show that the green variables can be
moved to the simple part of the constraint $C_{simp}$. To this end, note an obvious property: if
$E$ contains a constraint $x \# y$ and $x$ is green, then $y$ is green too. We can now do the
following with the green variables. As in Lemma 3.1 we can turn all the green variables into a
disjunction of chained constraints of the form $v_1 \# \ldots \# v_n$, where $\#$ are $=_{TA}$,
$\succ_w$, or $\succ_{lex}$, and use the distributivity laws to obtain chained constraints
$v_1 \# \ldots \# v_n$. Let us call this constraint a green chain. Then, if there is any equality
$v_i =_{TA} v_{i+1}$ in the green chain, we add this equality to $C_{triang}$ and replace this
equality by $v_{i+1}$ in the chain. Further, if the chain has the form
$v_1 \succ_{lex} \ldots \succ_{lex} v_k \succ_w v_{k+1} \# \ldots \# v_n$, we add
$v_1 \succ_{lex} \ldots \succ_{lex} v_k$ to $C_{simp}$ and $|v_k| > |v_{k+1}|$ to $C_{arith}$, and
replace the green chain by $v_{k+1} \# \ldots \# v_n$. We do this transformation until the green
chain becomes of the form $v_1 \succ_{lex} \ldots \succ_{lex} v_k$. After this, the green chain can
be removed from $E$ and added to $C_{simp}$. Evidently, this transformation can be presented as a
nondeterministic polynomial-time algorithm.

The red variables. Let us show the following: in every term $f^k(h(u_1, \ldots, u_m))$ in the
first row at most one variable among $u_1, \ldots, u_m$ is red. It is not hard to argue that it is
sufficient to prove a stronger statement: if for some $i$ the variable $u_i$ is red or blue, then
all variables $u_1, \ldots, u_{i-1}$ are blue. So suppose that $u_i$ is either red or blue and
$u_i \# y_n \# \ldots \# y_1$ is a shortest chain in $E$ such that $y_1$ is old. We prove that the
variables $u_1, \ldots, u_{i-1}$ are blue, by induction on $n$. When $n = 1$ and $u_i$ is red, $E$
contains either $u_i \succ y_1$ or $y_1 \succ u_i$, where $y_1$ is old. Without loss of generality
assume that $E$ contains $u_i \succ y_1$. Then (cf. (10)) this constraint appeared in $E$ when we
replaced $f^k(h(u_1, \ldots, u_m)) \succ_{lex} f^k(h(v_1, \ldots, v_m))$ by
$u_1 =_{TA} v_1 \wedge \ldots \wedge u_{i-1} =_{TA} v_{i-1} \wedge u_i \succ v_i$
and $y_1 = v_i$. But then $E$ also contains the equations
$u_1 =_{TA} v_1, \ldots, u_{i-1} =_{TA} v_{i-1}$ where the variables $v_1, \ldots, v_{i-1}$ are
old, and so the variables $u_1, \ldots, u_{i-1}$ are blue. In the same way we can prove that if
$u_i$ is blue then $u_1, \ldots, u_{i-1}$ are blue. The proof for $n > 1$ is similar, but we use
the fact that $v_1, \ldots, v_{i-1}$ are blue rather than old.

To complete the transformation, we add all constraints on the red and the old variables to
$C_{chain}$ and make $C_{chain}$ into a disjunction of chained constraints as in Lemma 3.1.

Getting rid of the blue variables. If $E$ contains a blue variable $x$, then it also contains
a chain of constraints $x =_{TA} x_1 =_{TA} \ldots =_{TA} x_n$, where $x_n$ is an old variable. We
replace $x$ by $x_n$ in $C$ and add $x =_{TA} x_n$ to the triangle part $C_{triang}$.

When we have completed the transformation on the first row, the row disappears from the chained
part $C_{chain}$ of $C$. If the first row contained no function symbols, the size of $C_{chain}$
will become smaller, since several variables will be removed from it. If $C_{chain}$ contained at
least one function symbol, then after the transformation the number of occurrences of function
symbols in $C_{chain}$ will decrease. Some red variables will be introduced, but we proved that
their number is not greater than the number of variables eliminated from the first row. Therefore,
the size of $C_{chain}$ strictly decreases after the transformation due to elimination of at least
one function symbol.

Again, it is not hard to argue that the transformation can be presented as a nondeterministic 
polynomial-time algorithm computing all members of the resulting disjunction of constraints. 

□ 



Lemmas 3.1 and 3.5 imply the following:



Lemma 3.6 Let $C$ be a constraint. Then there exists a disjunction $C_1 \vee \ldots \vee C_n$ of
constraints in isolated form equivalent to $C$ up to $f$. Moreover, members of such a disjunction
can be found by a nondeterministic polynomial-time algorithm. □

Our next aim is to present a nondeterministic polynomial-time algorithm solving constraints 
in isolated form. 

4 From constraints in isolated form to systems of linear Diophantine inequalities

Let $C$ be a constraint in isolated form

$$C_{simp} \wedge C_{arith} \wedge C_{triang}.$$

Our decision algorithm will be based on a transformation of the simple constraint $C_{simp}$ into
an equivalent disjunction $D$ of arithmetical constraints. Then we can check the satisfiability of
the resulting formula $D \wedge C_{arith}$ by using an algorithm for solving systems of linear
Diophantine inequalities on the weights of variables.

To transform $C_{simp}$ into an arithmetical formula, observe the following. The constraint
$C_{simp}$ is a conjunction of the constraints of the form

$$x_1 \succ_{lex} \ldots \succ_{lex} x_N$$

having no common variables. To solve such a constraint we have to ensure that there exist at
least $N$ different terms of the same weight as $x_1$ (since the Knuth-Bendix order is total).



In this section we will show that for each $N$ the statement "there exist at least $N$ different
terms of a weight $w$" can be expressed in the Presburger Arithmetic as an existential formula
of one variable $w$.

We say that a relation $R(x)$ on natural numbers is $\exists$-definable, if there exists an
existential formula of Presburger Arithmetic $C(x, y)$ such that $R(x)$ is equivalent to
$\exists y\, C(x, y)$. We call a function $r(x)$ $\exists$-definable if so is the relation
$r(x) = y$. Note that a composition of $\exists$-definable functions is $\exists$-definable.

Let us fix an enumeration $g_1, \ldots, g_S$ of the signature $\Sigma$. We assume that the first
$B$ symbols $g_1, \ldots, g_B$ have an arity $\ge 2$, and the first $F$ symbols
$g_1, \ldots, g_F$ are nonconstants. The arity of each $g_i$ is denoted by $arity_i$. In this
section we assume that $B$, $F$, $S$, and the weight function $w$ are fixed.

We call the contents of a ground term $t$ the tuple of natural numbers $(n_1, \ldots, n_S)$ such
that $n_i$ is the number of occurrences of $g_i$ in $t$ for all $i$. For example, if the sequence
of elements of $\Sigma$ is $g, h, a, b$, and $t = h(g(h(h(a)), g(b, b)))$, the contents of $t$ is
$(2, 3, 1, 2)$.
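
As a small illustration, the contents of a ground term can be computed by a single traversal. The
sketch below assumes a representation that is not in the paper: a term is a nested tuple whose
first component is the function symbol, so the constant $a$ is `("a",)`. The function name
`contents` is likewise only illustrative.

```python
from collections import Counter

def contents(term, signature):
    """Return the tuple (n_1, ..., n_S) of symbol counts, in signature order."""
    counts = Counter()
    stack = [term]
    while stack:
        symbol, *args = stack.pop()   # head symbol and its argument terms
        counts[symbol] += 1
        stack.extend(args)
    return tuple(counts[s] for s in signature)

# The example from the text: t = h(g(h(h(a)), g(b, b))) over the sequence g, h, a, b.
t = ("h", ("g", ("h", ("h", ("a",))), ("g", ("b",), ("b",))))
print(contents(t, ["g", "h", "a", "b"]))   # (2, 3, 1, 2)
```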

Lemma 4.1 The following relation $exists(x, n_1, \ldots, n_S)$ is $\exists$-definable: there
exists at least one ground term of $\Sigma$ of the weight $x$ and contents $(n_1, \ldots, n_S)$.

Proof. We will define $exists(x, n_1, \ldots, n_S)$ by a conjunction of two linear Diophantine
inequalities.

The first equation is

$$x = \sum_{1 \le i \le S} w(g_i) \cdot n_i. \tag{11}$$

It is not hard to argue that this equation says: every term with the contents
$(n_1, \ldots, n_S)$ has weight $x$.

The second formula says that the number of constant and nonconstant function symbols in
$(n_1, \ldots, n_S)$ is appropriately balanced for constructing a term:

$$1 + \sum_{1 \le i \le S} (arity_i - 1) \cdot n_i = 0. \tag{12}$$

□ 
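
The two conditions (11) and (12) are easy to check directly for concrete numbers. The following
sketch does so; the function name `exists` mirrors the relation in Lemma 4.1, while the argument
layout (parallel tuples of weights and arities in signature order) is an assumption made for the
example.

```python
def exists(x, n, weights, arities):
    """Check (11) and (12) for weight x and contents n (tuples in signature order)."""
    if any(k < 0 for k in n):
        return False
    weight_ok = x == sum(w * k for w, k in zip(weights, n))              # equation (11)
    balance_ok = 1 + sum((a - 1) * k for a, k in zip(arities, n)) == 0   # equation (12)
    return weight_ok and balance_ok

# Signature g (binary), h (unary), a, b (constants); all weights set to 1 for illustration.
weights, arities = (1, 1, 1, 1), (2, 1, 0, 0)
print(exists(8, (2, 3, 1, 2), weights, arities))   # True: h(g(h(h(a)), g(b, b))) is such a term
print(exists(7, (2, 3, 1, 1), weights, arities))   # False: one constant too few for two binary g
```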

Let us prove some lower bounds on the number of terms of a fixed weight.

We leave the following two lemmas to the reader. The first one implies that, if there exists
any ground term $t$ of a weight $x$ with at least $N$ occurrences of nonconstant symbols,
including at least one occurrence of a function symbol of an arity $\ge 2$, then there exist at
least $N$ different ground terms of the weight $x$.

Lemma 4.2 Let $x, n_1, \ldots, n_S$ be natural numbers such that $exists(x, n_1, \ldots, n_S)$
holds, $n_1 + \ldots + n_B \ge 1$ and $n_1 + \ldots + n_F \ge N$. Then there exist at least $N$
different ground terms with the contents $(n_1, \ldots, n_S)$. □

The second lemma implies that, if there exists any ground term $t$ of a weight $x$ with at least
$N$ occurrences of nonconstant function symbols, including at least two different unary function
symbols, then there exist at least $N$ different ground terms of the weight $x$.

Lemma 4.3 Let $x, n_1, \ldots, n_S$ be natural numbers such that $exists(x, n_1, \ldots, n_S)$
holds, $n_1 + \ldots + n_F \ge N$ and at least two numbers among $n_{B+1}, \ldots, n_F$ are
positive. Then there exist at least $N$ different ground terms with the contents
$(n_1, \ldots, n_S)$. □

Let us note that if our signature consists only of a unary function symbol of a positive weight
and constants, then the number of different terms of any weight is less than or equal to the
number of constants in the signature.

The remaining types of signatures are covered by the following lemma. 



Lemma 4.4 Let $\Sigma$ contain a function symbol of an arity greater than or equal to 2, or
contain at least two different unary function symbols. Then there exist two natural numbers $N_1$
and $N_2$ such that for all natural numbers $N$ and $x$ such that $x > N \cdot N_1 + N_2$, the
number of terms of the weight $x$ is either 0 or greater than $N$.

Proof. If $\Sigma$ contains a unary function symbol of the weight 0 then the number of different
terms of any weight is either 0 or $\omega$ and the lemma trivially holds.

Therefore we can assume that our signature contains no unary function symbol of the weight
0. Define

$$W = \max\{w(g_i) \mid 1 \le i \le S\};$$
$$A = \max\{arity_i \mid 1 \le i \le S\};$$
$$N_1 = W \cdot A;$$
$$N_2 = W^2 \cdot (A + 1) + W.$$

Take any $N$ and $x$ such that $x > N \cdot N_1 + N_2$.

Let us prove that if there exists a term of the weight $x$ then the number of occurrences of
nonconstant function symbols in this term is greater than $N$. Assume the opposite, i.e. there
exists a term $t$ of the weight $x$ such that the number of occurrences of nonconstant function
symbols in $t$ is $M \le N$. Let $(n_1, \ldots, n_S)$ be the contents of $t$ and $L$ denote the
number of occurrences of constants in $t$. Note that (12) implies
$L = 1 + \sum_{1 \le i \le F} (arity_i - 1) \cdot n_i$. Then using (11) we obtain

$$\begin{aligned}
N \cdot N_1 + N_2 < |t| &= \sum_{1 \le i \le S} w(g_i) \cdot n_i \le W \cdot \sum_{1 \le i \le S} n_i \\
&= W \cdot (M + L) = W \cdot \Big(M + 1 + \sum_{1 \le i \le F} (arity_i - 1) \cdot n_i\Big) \\
&\le W \cdot \Big(M + 1 + (A - 1) \cdot \sum_{1 \le i \le F} n_i\Big) = W \cdot (M + 1 + (A - 1) \cdot M) \\
&= W \cdot (M \cdot A + 1) \le W \cdot (N \cdot A + 1) \le N \cdot N_1 + N_2.
\end{aligned}$$

So we obtain a contradiction.

Consider the following possible cases.

1. There exists a term of the weight $x$ with an occurrence of a function symbol of an arity
greater than or equal to 2. In this case by Lemma 4.2 the number of different terms of
the weight $x$ is greater than $N$.

2. There exists a term of the weight $x$ with occurrences of at least two different unary function
symbols. In this case by Lemma 4.3 the number of different terms of the weight $x$ is greater
than $N$.



3. All terms of the weight $x$ have the form $g^k(c)$ for some unary function symbol $g$ and a
constant $c$. We show that this case is impossible. In particular, we show that for any
nonconstant function symbol $h$ there exists a term of the weight $x$ in which $g$ and $h$ occur,
therefore we obtain a contradiction with the assumption.

We have $x = w(g) \cdot k + w(c)$. Denote by $H$ the arity of $h$. Let us define integers
$M_1, M_2, M_3$ as follows:

$$M_1 = w(g),$$
$$M_2 = k - w(h) - w(c) \cdot (H - 1),$$
$$M_3 = w(g) \cdot (H - 1) + 1.$$

Let us prove that $M_1, M_2, M_3 > 0$ and there exists a term of the weight $x$ with $M_1$
occurrences of $h$, $M_2$ occurrences of $g$ and $M_3$ occurrences of $c$, and hence obtain a
contradiction.

Since $g$ is unary, $w(g) > 0$, and so $M_1 > 0$. Since $H \ge 1$, we have $M_3 > 0$. Let us show
that $M_2 > 0$, i.e. $k > w(h) + w(c) \cdot (H - 1)$. We have

$$\begin{aligned}
k = (x - w(c))/w(g) &> (N \cdot N_1 + N_2 - w(c))/w(g) \ge (N_2 - w(c))/w(g) \\
&= (W^2 \cdot (A + 1) + W - w(c))/w(g) \ge (W^2 \cdot (A + 1))/w(g) \\
&\ge W \cdot (A + 1) = W + W \cdot A \ge w(h) + w(c) \cdot A > w(h) + w(c) \cdot (H - 1).
\end{aligned}$$

It remains to show that there exists a term of the weight $x$ with $M_1$ occurrences of $h$,
$M_2$ occurrences of $g$ and $M_3$ occurrences of $c$. To this end we have to prove (cf. (11) and
(12))

$$x = w(h) \cdot M_1 + w(g) \cdot M_2 + w(c) \cdot M_3,$$
$$1 + (H - 1) \cdot M_1 + (1 - 1) \cdot M_2 + (0 - 1) \cdot M_3 = 0.$$

These equalities can be verified directly by replacing $M_1, M_2, M_3$ by their definitions and
$x$ by $w(g) \cdot k + w(c)$. □



Define the binary function $tnt$ (truncated number of terms) as follows: $tnt(N, M)$ is the
minimum of $N$ and the number of terms of the weight $M$, and let us show that $tnt$ can be
computed in time polynomial in $N + M$. To give a polynomial-time algorithm for this function
we need an auxiliary definition and a lemma.

Definition 4.5 Let $(n_1, \ldots, n_S)$ and $(m_1, \ldots, m_S)$ be two tuples of natural numbers.
We say that $(n_1, \ldots, n_S)$ extends $(m_1, \ldots, m_S)$ if $n_i \ge m_i$ for
$1 \le i \le S$. □

The depth of a term is defined by induction as usual: the depth of every constant is 1 and
the depth of every nonconstant term $g(t_1, \ldots, t_n)$ is equal to the maximum of the depths of
the $t_i$'s plus 1.

Lemma 4.6 Let $t_1, \ldots, t_n$ be a collection of different terms of the same depth $d$ and
$Con$ be the contents of a term such that $Con$ extends the contents of all terms $t_i$,
$1 \le i \le n$. Then there exist at least $n$ different terms with the contents $Con$.

Proof. Let us define the notion of leftmost subterm of a term $t$ as follows: every constant $c$
has only one leftmost subterm, namely $c$ itself, and leftmost subterms of a nonconstant term
$g(r_1, \ldots, r_n)$ are this term itself and all leftmost subterms of $r_1$. Evidently, for each
positive integer $d$ and term $t$, $t$ has at most one leftmost subterm of the depth $d$.

It is not hard to argue that from the condition of the lemma it follows that for every term
$t_i$ there exists a term $s_i$ with the contents $Con$ such that $t_i$ is a leftmost subterm of
$s_i$. But then the terms $s_1, \ldots, s_n$ are pairwise different, since they have different
leftmost subterms of the depth $d$. □



Lemma 4.7 Let the signature $\Sigma$ contain no unary function symbol of the weight 0 and contain
either a function symbol of an arity greater than or equal to 2 or at least two different
unary function symbols. Then the function $tnt(N, M)$ is computable in time polynomial in
$M + N$.

Proof. It is not hard to argue that for every contents $(n_1, \ldots, n_S)$ such that some of the
$n_i$'s is greater than $M$, any term with these contents has the weight greater than $M$. The
number of different contents in which each of the $n_i$'s is less than or equal to $M$ is at most
$(M + 1)^S$, i.e. it is polynomial in $M$; moreover, all these contents can be obtained by an
algorithm working in time polynomial in $M$.

Therefore it is sufficient to describe a polynomial-time algorithm which for all contents
$(n_1, \ldots, n_S)$, where $1 \le n_i \le M$, returns the minimum of $N$ and the number of terms
with these contents.

Let us fix contents $Con = (n_1, \ldots, n_S)$ where $1 \le n_i \le M$. Using equations (11) and
(12), one can check in polynomial time whether there exists a term with the contents $Con$, so we
assume that there exists at least one such term.

Our algorithm constructs, step by step, sets $T_0, T_1, \ldots$ of different terms with contents
which can be extended to the contents $Con$. Each set $T_i$ will consist only of terms of the
depth $i$.

1. Step 0. Define $T_0 = \emptyset$.

2. Step $i + 1$. Define

$$T_{i+1} = \{\, g(t_1, \ldots, t_m) \mid g \in \Sigma,\ t_1, \ldots, t_m \in T_1 \cup \ldots \cup T_i,\ Con \text{ extends the content of } g(t_1, \ldots, t_m), \text{ and the depth of } g(t_1, \ldots, t_m) \text{ is } i + 1 \,\}.$$

If $T_{i+1}$ has $N$ or more terms, then by Lemma 4.6 there exist at least $N$ different terms of
the content $Con$, so we terminate and return $N$. If $T_{i+1}$ is empty, we return as the result
the minimum of $N$ and the number of terms with the content $Con$ in
$T_1 \cup \ldots \cup T_{i+1}$.

Let us prove some obvious properties of this algorithm.

1. If some $T_i$ contains $N$ or more terms, then there exist at least $N$ terms with the content
$Con$. As we noted, this follows from Lemma 4.6.

2. At the end of step $i + 1$ the set $T_1 \cup \ldots \cup T_{i+1}$ contains all the terms with
the contents $Con$ of the depth $\le i + 1$. This property obviously holds by our construction.

These properties ensure that the algorithm is correct. To prove that it works in time polynomial
in $M + N$ it is enough to note that each step can be made in time polynomial in $M + N$ and the
total number of steps is at most $M + 1$. □
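
For experimentation, $tnt$ can also be computed by a different and simpler technique than the
set construction used in the proof: dynamic programming over weights, with every count capped at
$N$. The sketch below is not the paper's algorithm. It assumes integer weights, positive weights
for constants and unary symbols (as in Lemma 4.7), and a signature given as a list of
`(weight, arity)` pairs; the helper names `terms` and `tuples` are illustrative.

```python
from functools import lru_cache

def tnt(N, M, signature):
    """min(N, number of ground terms of weight M); signature is a list of (weight, arity)."""

    @lru_cache(maxsize=None)
    def terms(m):
        # min(N, number of terms of weight m); every term has weight >= 1
        if m <= 0:
            return 0
        return min(N, sum(tuples(a, m - w) for w, a in signature))

    @lru_cache(maxsize=None)
    def tuples(a, m):
        # min(N, number of a-tuples of terms whose weights sum to m)
        if m < 0:
            return 0
        if a == 0:
            return 1 if m == 0 else 0
        total = 0
        # the first component has weight j; each of the other a - 1 needs weight >= 1
        for j in range(1, m - (a - 1) + 1):
            total += terms(j) * tuples(a - 1, m - j)
        return min(N, total)

    return terms(M)

# Signature of Example 5.3: s, g unary, h binary, c constant, all of weight 1.
sig = [(1, 1), (1, 1), (1, 2), (1, 0)]
print([tnt(100, m, sig) for m in range(1, 5)])   # [1, 2, 5, 14]
```

Capping at $N$ keeps all intermediate numbers small, so the table has $O(A \cdot M)$ entries, each
filled with $O(M)$ additions. The recursion depth grows with $M$, so very large $M$ would need an
iterative formulation.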

Now we are ready to prove the main lemma of this section. 

Lemma 4.8 There exists an algorithm, running in time polynomial in $N$, which constructs an
existential formula $at\_least_N(x)$ valid on a natural number $x$ if and only if there exist at
least $N$ different terms of the weight $x$.

Proof. If the signature $\Sigma$ contains a unary function symbol of the weight 0 then the number
of different terms of any weight is either 0 or $\omega$. Therefore we can define
$at\_least_N(x)$ as $\exists n_1 \ldots \exists n_S\, exists(x, n_1, \ldots, n_S)$.

Let us consider the case when the signature $\Sigma$ consists of a unary function symbol $g$ of a
positive weight and constants. For every constant $c$ in $\Sigma$ consider the formula
$G_c(x) = \exists k\,(w(g) \cdot k + w(c) = x)$. It is not hard to argue that $G_c(x)$ holds if
and only if there exists a term of the weight $x$ of the form $g^k(c)$. Let $P$ be the set of all
sets of cardinality $N$ consisting of constants of $\Sigma$ (the cardinality of $P$ is obviously
polynomial in $N$). It is easy to see that

$$at\_least_N(x) \leftrightarrow \bigvee_{Q \in P} \; \bigwedge_{c \in Q} G_c(x).$$

It remains to consider the case when our signature contains a function symbol of an arity greater
than or equal to 2, or contains at least two different unary function symbols. By Lemma 4.4,
there exist constants $N_1$ and $N_2$ such that for any natural number $x$ such that
$x > N \cdot N_1 + N_2$ the number of terms of the weight $x$ is either 0 or greater than $N$. Let
us denote $N \cdot N_1 + N_2$ as $M$ and the set
$\{M' \mid M' \le M \wedge tnt(N, M') \ge N\}$ as $W$. By the lemmas above, we have

$$at\_least_N(x) \leftrightarrow (\exists n_1, \ldots, n_S\; exists(x, n_1, \ldots, n_S) \wedge x > M) \vee \Big(\bigvee_{M' \in W} x = M'\Big).$$

□ 



5 Main results 

In this section we complete the proofs of the main results of this paper. 

Theorem 5.1 For every Knuth-Bendix order, the problem of solving ordering constraints is 
contained in NP. 



Proof. Take a constraint. By Lemma 3.5 it can be effectively transformed into an equivalent
disjunction of isolated forms, so it remains to show how to check satisfiability of constraints in
isolated form.

Suppose that $C$ is a constraint in isolated form. Recall that $C$ is of the form

$$C_{arith} \wedge C_{triang} \wedge C_{simp}. \tag{13}$$

Let $C_{simp}$ contain a chain $x_1 \succ_{lex} \ldots \succ_{lex} x_N$ such that
$x_1, \ldots, x_N$ do not occur in the rest of $C_{simp}$. Denote by $C'_{simp}$ the constraint
obtained from $C_{simp}$ by removing this chain. It is not hard to argue that $C$ is equivalent to
the constraint



K. Korovin and A. Voronkov. KBO constraint solving is NP-complete






$$C_{arith} \wedge C_{triang} \wedge C'_{simp} \wedge \bigwedge_{i = 2 \ldots N} (|x_i| =_N |x_1|) \wedge at\_least_N(|x_1|).$$

In this way we can replace $C_{simp}$ by an arithmetical constraint, so we assume that $C_{simp}$
is empty. Let $C_{triang}$ have the form

$$y_1 =_{TA} t_1 \wedge \ldots \wedge y_n =_{TA} t_n.$$

Let $Z$ be the set of all variables occurring in $C_{arith} \wedge C_{triang}$. It is not hard to
argue that $C_{arith} \wedge C_{triang}$ is satisfiable if and only if the following constraint is
satisfiable:

$$C_{arith} \wedge |y_1| =_N |t_1| \wedge \ldots \wedge |y_n| =_N |t_n| \wedge \bigwedge_{z \in Z} at\_least_1(|z|).$$

So we reduced the decidability of the existential theory of term algebras with a Knuth-Bendix 
order to the problem of solvability of systems of linear Diophantine inequalities. Our proof can 
be represented as a nondeterministic polynomial-time algorithm. 

□ 

This theorem implies the main result of this paper. Let us call a signature $\Sigma$ trivial if it
consists of one constant symbol. Evidently, the first-order theory of the term algebra of a trivial
signature is polynomial.

Theorem 5.2 The existential first-order theory of any term algebra of a non-trivial signature
with the Knuth-Bendix order is NP-complete.

Proof. The containment in NP follows from Theorem 5.1. It is easy to prove NP-hardness by
reducing propositional satisfiability to the existential theory of the algebra (even without the
order). □

Let us show that for some Knuth-Bendix orders even constraint solving can be NP-hard. 

Example 5.3 Consider the signature $\Sigma = \{s, g, h, c\}$, where $h$ is binary, $s, g$ are
unary, and $c$ is a constant. Define the weight of all symbols as 1, and use any order $\gg$ on
$\Sigma$ such that $g \gg s$. Our aim is to represent any linear Diophantine equation by
Knuth-Bendix constraints. To this end, we will consider any ground term $t$ as representing the
natural number $|t| - 1$.
Define the formula

$$equal\_weight(x, y) \leftrightarrow g(x) \succ s(y) \wedge g(y) \succ s(x).$$

It is not hard to argue that, for any ground terms $r, t$, $equal\_weight(r, t)$ holds if and only
if $|r| = |t|$. It is enough to consider systems of linear Diophantine equations of the form

$$x_1 + \ldots + x_n + k = x_0, \tag{14}$$

where $x_0, \ldots, x_n$ are pairwise different variables, and $k \in N$. Consider the constraint



$$equal\_weight(s^{k+1}(h(y_1, h(y_2, \ldots, h(y_{n-1}, y_n)))),\; s^{2n-1}(y_0)). \tag{15}$$

It is not hard to argue that formula (15) holds if and only if

$$|y_1| - 1 + \ldots + |y_n| - 1 + k = |y_0| - 1. \tag{16}$$

Using (16), we can transform any system $D(x_1, \ldots, x_n)$ of linear Diophantine equations of
the form (14) into a constraint $C(y_1, \ldots, y_n)$ such that for every tuple of ground terms
$t_1, \ldots, t_n$, $C(t_1, \ldots, t_n)$ holds if and only if so does
$D(|t_1| - 1, \ldots, |t_n| - 1)$.
Similarly, using a formula

$$greater\_weight(x, y) \leftrightarrow s(x) \succ g(y)$$

one can represent systems of linear inequalities using Knuth-Bendix constraints. □
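
The behaviour of the two formulas of Example 5.3 on ground terms can be checked with a small
comparator. The sketch below implements the Knuth-Bendix order on ground terms only (compare
weights, then head symbols by precedence, then arguments left to right). The tuple representation
of terms and the concrete precedence numbers are assumptions for the example; the text only
requires $g \gg s$.

```python
WEIGHT = {"s": 1, "g": 1, "h": 1, "c": 1}
PREC = {"c": 0, "s": 1, "g": 2, "h": 3}   # assumed total precedence with g above s

def weight(t):
    """Weight |t| of a ground term given as a nested tuple (symbol, arg, ...)."""
    return WEIGHT[t[0]] + sum(weight(a) for a in t[1:])

def kbo_greater(t, s):
    """Ground Knuth-Bendix comparison: weight, then precedence, then lexicographic."""
    wt, ws = weight(t), weight(s)
    if wt != ws:
        return wt > ws
    if t[0] != s[0]:
        return PREC[t[0]] > PREC[s[0]]
    for a, b in zip(t[1:], s[1:]):
        if a != b:
            return kbo_greater(a, b)
    return False                      # identical terms are not strictly greater

def equal_weight(x, y):
    return kbo_greater(("g", x), ("s", y)) and kbo_greater(("g", y), ("s", x))

def greater_weight(x, y):
    return kbo_greater(("s", x), ("g", y))

c = ("c",)
print(equal_weight(("s", c), ("g", c)))        # True: both terms have weight 2
print(greater_weight(("h", c, c), ("s", c)))   # True: weight 3 against weight 2
print(greater_weight(("s", c), ("g", c)))      # False: equal weights, and g is above s
```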

Since it is well-known that solving linear Diophantine equations is NP-hard, we have the 
following theorem. 

Theorem 5.4 For some Knuth-Bendix orders, the problem of solving ordering constraints is 
NP-complete. □ 

This result does not hold for all non-trivial signatures, as the following lemma shows.

Lemma 5.5 There exists a polynomial time algorithm which solves ordering constraints for any 
given term algebra over a signature consisting of constants and any total ordering >- on that 
term algebra. 

Proof. Let $\Sigma = \{c_1, \ldots, c_n\}$; w.l.o.g. we can assume that
$c_n \succ c_{n-1} \succ \ldots \succ c_1$. Let $C$ be an ordering constraint. First we get rid of
equalities as follows. If $t =_{TA} s$ occurs in $C$ and $t$ is syntactically equal to $s$ then we
remove $t =_{TA} s$ from $C$; if $t$ is a variable then we replace all occurrences of $t$ in $C$ by
$s$ and remove $t =_{TA} s$ from $C$; otherwise $t$ and $s$ are different constants and $C$ is
unsatisfiable. Now $C$ consists of conjunctions of atomic formulas of the form $t \succ s$.
We define a relation $\succ'_C$ on terms as follows: $t \succ'_C s$ if and only if $t \succ s$
occurs in $C$. Let $\succ_C$ denote the transitive closure of $\succ'_C$. It is easy to see that,
using a polynomial-time algorithm for transitive closure, we can compute the relation
$t \succ_C s$ in polynomial time. Note that if $\succ_C$ is not a strict order then the constraint
$C$ is unsatisfiable. So we assume that $\succ_C$ is a strict partial order.

Now we replace all variables in $C$ by constants as follows. Take a variable $x$ such that there
is no variable less than $x$ w.r.t. $\succ_C$. There are two possible cases:

1. $x$ is a minimal term w.r.t. $\succ_C$; then we replace all occurrences of $x$ in $C$ by $c_1$.

2. there exist some constants less than $x$ w.r.t. $\succ_C$; then let $c_{max}$ be the greatest
w.r.t. $\succ$ constant among such constants. If $c_{max}$ is the maximal constant in
$TA(\Sigma)$ then the constraint $C$ is unsatisfiable, otherwise we replace all occurrences of $x$
by $c_{max+1}$.
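
The greedy idea behind these two cases (give every variable the least constant that lies above
everything required to be below it) can be sketched as follows. This is an illustration under
assumptions, not the procedure of the proof step by step: equalities are taken to be already
eliminated, the constant $c_i$ is modelled by the integer $i$, variables by strings, and the
function name `solve_constants` is hypothetical. Unlike the text, the sketch also checks
explicitly that atoms between constants agree with the fixed order.

```python
def solve_constants(n, atoms):
    """atoms: pairs (t, s) meaning t > s; constants are ints 1..n, variables are strings.
    Returns a dict mapping each variable to a constant index, or None if unsatisfiable."""
    below = {}
    for t, s in atoms:
        below.setdefault(t, set()).add(s)
        below.setdefault(s, set())
    value, open_terms = {}, set()

    def least(t):
        if t in value:
            return value[t]
        if t in open_terms:
            raise ValueError("cyclic constraint")    # the closure is not a strict order
        open_terms.add(t)
        floor = max((least(s) for s in below[t]), default=0)
        open_terms.discard(t)
        if isinstance(t, int):
            if floor >= t:
                raise ValueError("constant too small")
            value[t] = t
        else:
            if floor + 1 > n:
                raise ValueError("no constant left")  # c_max is already the maximal constant
            value[t] = floor + 1                      # c_1 if nothing is below, else c_{max+1}
        return value[t]

    try:
        for t in below:
            least(t)
    except ValueError:
        return None
    return {t: v for t, v in value.items() if isinstance(t, str)}

print(solve_constants(4, [("x", 1), ("y", "x"), (4, "y")]))   # {'x': 2, 'y': 3}
print(solve_constants(3, [("x", 1), ("y", "x"), (3, "y")]))   # None
```

Because every variable receives the least admissible value, any upper bound that some solution
satisfies is also satisfied by this assignment.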



(15) 
Repeating this process we replace all variables in $C$ in polynomial time. To complete the
proof of the lemma, it remains to show that transformations 1 and 2 above preserve satisfiability
of constraints without equality. To this end, we consider a constraint $C$ without equality and a
solution $\theta$ to $C$. If transformation 1 is applicable to $C$ then it is easy to see that

$$\theta'(x) = \begin{cases} c_1, & \text{if } x \text{ is minimal w.r.t. } \succ_C, \\ \theta(x), & \text{otherwise} \end{cases}$$

is a solution to the constraint obtained after applying transformation 1 to $C$.

Similarly, one can show that transformation 2 preserves satisfiability of constraints without
equality.

□ 
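
The procedure in the proof of Lemma 5.5 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the encoding (the integer i stands for the constant $c_i$, strings stand for variables) and the name `solve_constant_constraint` are hypothetical. Instead of computing the transitive closure explicitly, the sketch repeatedly picks a variable with no unassigned variable below it, gives it the least admissible constant, and finishes with a ground check of all atoms.

```python
def solve_constant_constraint(n, constraint):
    """Decide a constraint over the constants c_1 < ... < c_n.

    Assumed encoding: the int i stands for c_i, strings are variables.
    `constraint` is a list of (op, t, s) with op in {"=", ">"}.
    Returns a satisfying assignment as a dict, or None if unsatisfiable.
    """
    is_var = lambda u: isinstance(u, str)

    # Step 1: eliminate equalities by substitution.
    subst = {}

    def resolve(u):
        while is_var(u) and u in subst:
            u = subst[u]
        return u

    for op, t, s in constraint:
        if op != "=":
            continue
        t, s = resolve(t), resolve(s)
        if t == s:
            continue            # syntactically equal: drop the equality
        if is_var(t):
            subst[t] = s
        elif is_var(s):
            subst[s] = t
        else:
            return None         # two different constants

    atoms = [(resolve(t), resolve(s)) for op, t, s in constraint if op == ">"]
    all_vars = {u for _, t, s in constraint for u in (t, s) if is_var(u)}
    roots = sorted({resolve(v) for v in all_vars if is_var(resolve(v))})

    # Step 2: assign variables from the bottom up.
    value = {}
    val = lambda u: value.get(u) if is_var(u) else u

    while len(value) < len(roots):
        progress = False
        for x in roots:
            if x in value:
                continue
            below = [val(s) for t, s in atoms if t == x]
            if None in below:
                continue        # a variable below x is still unassigned
            # Case 1: nothing below x, take c_1.
            # Case 2: take the successor of the greatest constant below x.
            value[x] = 1 + max(below, default=0)
            if value[x] > n:
                return None     # would need a constant above c_n
            progress = True
        if not progress:
            return None         # cyclic atoms: not a strict order

    # Step 3: check every atom on the resulting ground constraint.
    if not all(val(t) > val(s) for t, s in atoms):
        return None
    return {v: val(resolve(v)) for v in sorted(all_vars)}
```

For instance, with three constants the constraint $x \succ c_1 \wedge c_3 \succ x$ is encoded as `[(">", "x", 1), (">", 3, "x")]` and the sketch assigns $c_2$ to $x$. Each pass of the loop assigns at least one variable or stops, so the running time is polynomial in the size of the constraint.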



Corollary 5.6 There exists a polynomial time algorithm which checks solvability of ordering 
constraints for any given Knuth-Bendix order on any term algebra over a signature consisting 
of constants. □ 

As we mentioned earlier, if we consider real-valued Knuth-Bendix orders then even
comparison of ground terms might be undecidable. Let us show it on the following example.

Example 5.7 Consider a non-computable real number $r$ such that $0 < r < 1$, i.e. there is no
algorithm which given a positive integer $n$ computes $r$ with the precision $1/n$, in other words
finds two natural numbers $p, q$ such that $|r - p/q| < 1/n$.

Now we consider a signature consisting of two unary symbols $g, h$ and a constant $c$ and
consider any Knuth-Bendix order $\succ$ on the corresponding term algebra, such that $w(g) = 1$
and $w(h) = r$. Let us show that comparison of terms in this Knuth-Bendix order is undecidable.
Consider a positive integer $n$. Then, it is easy to see that there exists a positive integer $m$ such
that $g^m(c) \succ h^n(c) \succ g^{m-1}(c)$. Since $|g^m(c)| \neq |h^n(c)| \neq |g^{m-1}(c)|$, we have $|g^m(c)| > |h^n(c)| >
|g^{m-1}(c)|$. From the definition of the weight function we have that $m > rn > m - 1$ and
therefore $m/n > r > (m-1)/n$. Let us take $p = m - 1$ and $q = n$, then we have $|r - p/q| < 1/n$.

Therefore using comparison of terms we can compute r with the precision 1/n. This implies 
that comparison of terms for this Knuth-Bendix order is undecidable. □ 
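
The reduction in Example 5.7 can be illustrated with a short Python sketch. For a truly non-computable $r$ no comparison procedure can exist, which is exactly the point of the example; to make the sketch runnable we assume, purely for illustration, the computable stand-in $r = \sqrt{2}/2$, for which weights can be compared exactly in integer arithmetic. The names `kbo_greater` and `approximate` and the encoding of $g^k(c)$ and $h^k(c)$ as pairs `("g", k)` and `("h", k)` are hypothetical.

```python
from fractions import Fraction

def kbo_greater(t, s):
    """Stand-in comparison oracle for terms g^a(c) and h^b(c).

    Assumes w(g) = 1 and w(h) = r with r = sqrt(2)/2 (an illustrative,
    computable choice). Terms are encoded as ("g", a) or ("h", b).
    Only weights are compared; for different head symbols the weights
    never coincide because r is irrational.
    """
    (f, a), (k, b) = t, s
    if f == k:
        return a > b
    if f == "g":
        return 2 * a * a > b * b    # a > b * sqrt(2)/2
    return a * a > 2 * b * b        # a * sqrt(2)/2 > b

def approximate(n, greater=kbo_greater):
    """Return p/q with |r - p/q| < 1/n, using only term comparisons."""
    m = 1
    # Find the least m with g^m(c) > h^n(c); then h^n(c) > g^(m-1)(c),
    # i.e. m > r*n > m - 1.
    while not greater(("g", m), ("h", n)):
        m += 1
    return Fraction(m - 1, n)       # p = m - 1, q = n

print(approximate(10))    # 7/10
print(approximate(1000))  # 707/1000
```

Any decision procedure for term comparison could be plugged in as `greater`, so such a procedure would make $r$ computable to arbitrary precision, contradicting the choice of $r$.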



6 Related work and open problems 

In this section we overview previous work on Knuth-Bendix orders, recursive path orders, and 
extensions of term algebras with various relations. 

The Knuth-Bendix order was introduced in [Knuth and Bendix 1970]. Later, Dershowitz 
[1982] introduced recursive path orders (RPOs) and Kamin and Levy [1980] lexicographic path 
orders (LPOs). A number of results on recursive path orders and solving LPO and RPO ordering 
constraints are known. 

However, except for the very general result of [Nieuwenhuis 1993] the techniques used for 
RPO constraints are not directly applicable to Knuth-Bendix orders. We used systems of linear
Diophantine inequalities in our decidability proofs. This is not coincidental: Example 5.3 shows
that systems of linear Diophantine inequalities are definable in the Knuth-Bendix order. 

Comon and Treinen [1994] proved that LPO constraint solving is NP-hard already for con- 
straints consisting of a single inequality. In [Korovin and Voronkov 2001] we prove that the 
problem of solving Knuth-Bendix ordering constraints consisting of a single inequality can be
solved in polynomial time. 

In [Korovin and Voronkov 2001] we present a polynomial time algorithm for the orientability 
problem: given a system of rewrite rules R, does there exist a Knuth-Bendix order which orients
every ground instance of every rewrite rule in R? A similar problem of orientability by the non-
ground version of the real-valued Knuth-Bendix order was studied by Dick, Kalmus, and Martin
[Martin 1987, Dick, Kalmus and Martin 1990] and an algorithm for orientability was given.
Algorithms for, and complexity of, the orientability problem for various versions of the recursive
path orders were considered in [Lescanne 1984, Detlefs and Forgaard 1985, Krishnamoorthy and 
Narendran 1985]. In particular, in [Krishnamoorthy and Narendran 1985] it is shown that the 
orientability problem by the non-ground version of the recursive path order is NP-complete. 

Comon [1990] proved the decidability and Nieuwenhuis [1993] NP-completeness of LPO 
constraint solving. Jouannaud and Okada [1991] proved the decidability and Narendran et al. 
[1999] NP-completeness of RPO constraint solving. Recently, Nieuwenhuis and Rivero [1999] 
proposed a new efficient method for solving RPO constraints. 

Lepper [2001] studies derivation length and order types of Knuth-Bendix orders, both for
integer-valued and real-valued weight functions.

Term algebras are rather well-studied structures. Malcev [1961] was the first to prove the 
decidability of the first-order theory of term algebras. Other methods of proving decidability 
were developed by Comon and Lescanne [1989], Kunen [1987], Belegradek [1988], Maher [1988]. 

If we introduce a binary predicate into a term algebra, then one can obtain a richer the- 
ory. Term algebras with the subterm predicate have an undecidable first-order theory and 
a decidable existential theory [Venkataraman 1987]. Term algebras with lexicographic path 
orders have an undecidable first-order theory [Comon and Treinen 1997]. However, if we con- 
sider term algebras over signatures consisting of unary symbols and constants then the first- 
order theory of lexicographic path orders over such term algebras is decidable [Narendran and 
Rusinowitch 2000]. In [Korovin and Voronkov 2002] we show that the first-order theory of 
any Knuth-Bendix order over any term algebra over a signature consisting of unary function 
symbols and constants is decidable. 

To conclude, we mention two open problems related to the Knuth-Bendix order. One 
problem is whether the whole first-order theory of Knuth-Bendix orders is decidable. Another
problem is to describe the complexity of the constraint solving problem for Knuth-Bendix orders 
in the case of signatures consisting of unary function symbols and constants. 